DeepTriage: Exploring the Effectiveness of Deep Learning for Bug Triaging

Authors

  • Senthil Mani
  • Anush Sankaran
  • Rahul Aralikatte
Abstract

For a given software bug report, the primary task of bug triaging is to identify an appropriate developer who could fix the bug. Most bug tracking systems provide a bug title (summary) and a detailed description. Automatic bug triaging can be formulated as a classification problem that takes the bug title and description as input and maps them to one of the available developers (the class labels). The major challenge is that the bug description usually combines free unstructured text, code snippets, and stack traces, which makes the input data highly noisy. Existing bag-of-words (BOW) feature models do not consider the syntactic and sequential word information available in the unstructured text. In this research, we propose a novel bug report representation algorithm using an attention-based deep bidirectional recurrent neural network (DBRNN-A) model that learns syntactic and semantic features from long word sequences in an unsupervised manner. The DBRNN-A based bug representation, instead of BOW features, is then used to train the classifier. The attention mechanism enables the model to learn a context representation over a long word sequence, such as a bug report. To provide a large amount of data for feature learning, we leverage unfixed bug reports, which constitute about 70% of the bugs in an open source bug tracking system and were completely ignored in previous studies; this is an important contribution of this research. Another major contribution is to make this research reproducible by releasing the source code and creating a public benchmark dataset of bug reports from three open source bug tracking systems: Google Chromium, Mozilla Core, and Mozilla Firefox. For our experiments, we use 383,104 bug reports from Google Chromium, 314,388 bug reports from Mozilla Core, and 162,307 bug reports from Mozilla Firefox.
Experimentally, we compare our approach with the BOW model and softmax classifier, support vector machine, naive Bayes, and cosine distance, and observe that DBRNN-A provides a higher rank-10 average accuracy.
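The abstract reports results as rank-10 accuracy without defining the metric. A minimal sketch follows, assuming the common reading that a prediction counts as correct when the true developer appears among the k highest-scored developers for a bug report. The function name `rank_k_accuracy`, the developer names, and the scores are hypothetical and are not taken from the paper or its released code.

```python
from typing import Dict, List


def rank_k_accuracy(
    scores: List[Dict[str, float]], true_devs: List[str], k: int = 10
) -> float:
    """Fraction of bug reports whose true developer is in the top-k scored developers.

    Assumed definition of rank-k accuracy; the paper's exact averaging
    procedure is not described in the abstract.
    """
    if len(scores) != len(true_devs):
        raise ValueError("scores and true_devs must have the same length")
    if not scores:
        return 0.0
    hits = 0
    for dev_scores, true_dev in zip(scores, true_devs):
        # Sort developers by descending classifier score and keep the first k.
        top_k = sorted(dev_scores, key=dev_scores.get, reverse=True)[:k]
        if true_dev in top_k:
            hits += 1
    return hits / len(scores)


# Hypothetical classifier outputs for three bug reports (made-up values).
example_scores = [
    {"alice": 0.6, "bob": 0.3, "carol": 0.1},
    {"alice": 0.2, "bob": 0.5, "carol": 0.3},
    {"alice": 0.1, "bob": 0.2, "carol": 0.7},
]
example_truth = ["alice", "carol", "alice"]

for k in (1, 2, 3):
    print(k, rank_k_accuracy(example_scores, example_truth, k=k))
```

With only three candidate developers the toy data saturates at k = 3; with the thousands of developers in a real tracker, k = 10 is a much stricter cut.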


Similar resources

Automated, Highly-accurate Bug Triaging Using Machine Learning

Empirical studies indicate that automating the bug assignment process (also known as bug triaging) has the potential to significantly reduce software evolution effort and costs. Prior work has used machine learning techniques to automate bug triaging but has employed a narrow band of tools which can be ineffective in large, long-lived software projects. To redress this situation, in this paper ...


Novel Metrics for Bug Triage

Bug triaging is a vital part of issue management systems. Bug triaging deals with assigning a developer the task of an incoming bug. This activity is error-prone and time-consuming if done manually. There is a need for automated support to accelerate this process. The current automated bug triaging systems exploit the text contents of the bug and the tossing relations among the developers. The...


Machine Learning or Information Retrieval Techniques for Bug Triaging: Which is better?

Bugs are an inevitable part of a software system. Nowadays, large software development projects even release beta versions of their products to gather bug reports from users. The collected bug reports are then worked on by various developers in order to resolve the defects and make the final software product more reliable. The high frequency of incoming bugs makes the bug handling a difficul...


Reviewer recommendation for pull-requests in GitHub: What can we learn from code review and bug assignment?

Context: The pull-based model, widely used in distributed software development, offers an extremely low barrier to entry for potential contributors (anyone can submit contributions to any project through pull-requests). Meanwhile, the project’s core team must act as guardians of code quality, ensuring that pull-requests are carefully inspected before being merged into the main development l...


Efficient Bug Triaging Using Text Mining

Large open source software projects receive bug reports at a high rate. Triaging these incoming reports manually is error-prone and time-consuming. The goal of bug triaging is to assign potentially experienced developers to incoming bug reports. To reduce the time and cost of bug triaging, we present an automatic approach to predict a developer with relevant experience to solve the n...



Journal:
  • CoRR

Volume: abs/1801.01275

Publication date: 2018